Large deep learning models have achieved remarkable success in many scenarios. However, training large models is usually challenging, e.g., due to the high computational cost, the unstable and painfully slow optimization procedure, and the vulnerability to overfitting. To alleviate these problems, this work studies a divide-and-conquer strategy, i.e., dividing a large model into smaller modules, training them independently, and reassembling the trained modules to obtain the target model. This approach is promising since it avoids directly training large models from scratch. Nevertheless, implementing this idea is non-trivial, as it is difficult to ensure the compatibility of the independently trained modules. In this paper, we address this issue by introducing a global, shared meta model that implicitly links all the modules together. This enables us to train highly compatible modules that collaborate effectively once assembled. We further propose a module incubation mechanism that allows the meta model to be designed as an extremely shallow network. As a result, the additional overhead introduced by the meta model is minimized. Though conceptually simple, our method significantly outperforms end-to-end (E2E) training in terms of both final accuracy and training efficiency. For example, on top of ViT-Huge, it improves the accuracy by 2.7% compared to the E2E baseline on ImageNet-1K, while reducing the training cost by 43%. Code is available at https://github.com/LeapLabTHU/Model-Assembling.
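Recent advances in wireless technology enable connected autonomous vehicles (CAVs) to gather information about their environment through vehicle-to-vehicle (V2V) communication. In this work, we design an information-sharing-based multi-agent reinforcement learning (MARL) framework for CAVs that exploits this extra information when making decisions, in order to improve traffic efficiency and safety. The safe actor-critic algorithm we propose has two new techniques: a truncated Q-function and safe action mapping. The truncated Q-function uses the information shared by neighboring CAVs so that the joint state and action spaces of the Q-function do not grow with the size of a large-scale CAV system. We prove a bound on the approximation error between the truncated Q-function and the global Q-function. The safe action mapping provides a provable safety guarantee for both training and execution, based on control barrier functions. Using the CARLA simulator for experiments, we show that our approach improves the efficiency of the CAV system in terms of average velocity and comfort under different CAV ratios and different traffic densities. We also show that our approach avoids executing unsafe actions and always maintains a safe distance from other vehicles. We construct a scenario with an obstacle to show that shared vision can help CAVs observe obstacles earlier and take action to avoid traffic jams.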
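With the rapid development of information technology, online platforms (e.g., news portals and social media) generate an enormous amount of web information every moment. It is therefore crucial to extract structured representations of events from social streams. Existing event extraction research generally relies on pattern matching, machine learning, or deep learning methods. However, due to the unique characteristics of the Chinese language, the performance of Chinese event extraction is not as good as that of English. In this paper, we propose an integrated framework for Chinese event extraction. The proposed approach is a multi-channel input neural framework that integrates semantic features and syntactic features. The semantic features are captured by the BERT architecture. The part-of-speech (POS) features and dependency parsing (DP) features are captured by profiling embeddings and a graph convolutional network (GCN), respectively. We also evaluate our model on a real-world dataset. Experimental results show that the proposed method significantly outperforms the benchmark methods.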
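Research on crude oil price forecasting has attracted tremendous attention from scholars and policymakers because of its significant impact on the global economy. Besides supply and demand, crude oil prices are largely influenced by various factors, such as economic development, financial markets, conflicts, wars, and political events. Most previous research treats crude oil price forecasting as a time series or econometric variable prediction problem. Although some recent studies consider the effects of real-time news events, most of them use raw news headlines or topic models to extract text features without exploring the event information in depth. In this study, a novel crude oil price forecasting framework, AGESL, is proposed to deal with this problem. In our approach, an open-domain event extraction algorithm is used to extract underlying related events, and a text sentiment analysis algorithm is used to extract sentiment from large-scale news. Deep neural networks then integrate the news event features, sentiment features, and historical price features to predict future crude oil prices. Empirical experiments are performed on West Texas Intermediate (WTI) crude oil price data, and the results show that our approach achieves superior performance compared with several benchmark methods.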
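A key challenge of online news recommendation is helping users find articles they are interested in. Traditional news recommendation methods usually use a single type of news information, which is insufficient to encode news and user representations. Recent research uses multi-channel news information, e.g., title, category, and body, to enhance news and user representations. However, these methods only use various attention mechanisms to fuse multi-view embeddings, without mining the higher-level information contained in the context. They also encode news content at the word level and jointly train the attention parameters in the recommendation network, so more corpora are required to train the model. We propose an Event Extraction-based News Recommendation (EENR) framework to overcome these shortcomings, using event extraction to abstract higher-level information. EENR also uses a two-stage strategy to reduce the parameters in the subsequent parts of the recommendation network. In the first stage, we train the event extraction module on external corpora; in the second stage, we apply the trained model to the news recommendation dataset to predict event-level information, including event types, roles, and arguments. We then fuse multi-channel information, including event information, news title, and category, to encode news and users. Extensive experiments on a real-world dataset show that our EENR method can effectively improve the performance of news recommendation. Finally, we also explore whether it is reasonable to use higher-abstraction-level information as a substitute for news body content.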
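Social media platforms may provide a space for discourse that contains hate speech and, even worse, can act as a propagation mechanism for hate crimes. The FBI's Uniform Crime Reporting (UCR) Program collects hate crime data and releases a statistical report every year. These statistics provide information for determining national hate crime trends. They can also give law enforcement agencies valuable holistic and strategic insight, or help lawmakers justify specific legislation. However, the reports are mostly released the following year and lag behind many immediate needs. Recent research mainly focuses on hate speech detection in social media text or on empirical studies of the impact of confirmed crimes. This paper proposes a framework that first uses text mining techniques to extract hate crime events from New York Times news, and then uses the results to help predict national-level and state-level hate crime trends in the United States. Experimental results show that our method significantly improves prediction performance compared with time series or regression methods that do not use event-related factors. Our framework broadens the range of methods for predicting national-level and state-level hate crime trends.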
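With the rapid development of information technology, online platforms have produced enormous text resources. As a particular form of information extraction (IE), event extraction (EE) has gained increasing popularity because of its ability to automatically extract events from human language. However, literature surveys on event extraction are limited. Existing review works either spend much effort describing the details of various approaches or focus on a particular field. This study provides a comprehensive overview of state-of-the-art event extraction methods and their applications from text, including closed-domain and open-domain event extraction. A distinguishing trait of this survey is that it provides an overview of moderate complexity, avoiding too much detail about particular approaches. It focuses on the common characteristics, application fields, advantages, and disadvantages of representative works, and sets aside the specificities of individual approaches. Finally, we summarize the common issues, current solutions, and future research directions. We hope this work helps researchers and practitioners obtain a quick overview of recent event extraction.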
In this paper, we propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection. Without explicit view transformation, CMT takes image and point cloud tokens as inputs and directly outputs accurate 3D bounding boxes. The spatial alignment of multi-modal tokens is performed implicitly, by encoding the 3D points into multi-modal features. The core design of CMT is quite simple, yet its performance is impressive: CMT obtains 73.0% NDS on the nuScenes benchmark. Moreover, CMT remains strongly robust even when the LiDAR input is missing. Code will be released at https://github.com/junjie18/CMT.
Knowledge graphs (KGs) have served as a key component of various natural language processing applications. Commonsense knowledge graphs (CKGs) are a special type of KG in which entities and relations are composed of free-form text. However, previous work on KG completion and CKG completion suffers from long-tail relations and newly added relations that do not have many known triples for training. In light of this, few-shot KG completion (FKGC), which requires the strengths of both graph representation learning and few-shot learning, has been proposed to address the problem of limited annotated data. In this paper, we comprehensively survey previous attempts on such tasks in the form of a series of methods and applications. Specifically, we first introduce FKGC challenges, commonly used KGs, and CKGs. Then we systematically categorize and summarize existing works in terms of the type of KG and the methods. Finally, we present applications of FKGC models to prediction tasks in different areas and share our thoughts on future research directions for FKGC.
Few Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After the above steps, performance on the novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset for the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few Shot Object Detection. Code and model will be available.